{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f9b703ee",
   "metadata": {},
   "source": [
    "# LlamaIndex <> Vellum Integration\n",
    "\n",
    "This demo notebook shows how you can use [Vellum](https://www.vellum.ai/) to manage the prompts you use within LlamaIndex. By registering LlamaIndex prompts with Vellum via the `VellumPredictor` class, you're able to use Vellum to:\n",
    "1. View and manage all your prompts in a UI\n",
    "2. Compare and constrast prompt templates, and even different LLM providers/models side-by-side\n",
    "3. Update your prompt, or the provider/model backing it, without updating or re-deploying code\n",
    "4. Set up unit tests to test your prompt and quantitatively evaluate the quality of its outputs\n",
    "5. Observe and monitor all traffic to it, including the inputs sent to the model, and the outputs sent back\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "1. Sign up for a free Vellum account at [app.vellum.ai/signup](https://app.vellum.ai/signup)\n",
    "2. Go to [app.vellum.ai/api-keys](https://app.vellum.ai/api-keys) and generate a Vellum API key. Note it down somewhere."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9c2a6288",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install vellum-ai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca5638a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import VellumPredictor, VellumPromptRegistry"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5ea8e54",
   "metadata": {},
   "outputs": [],
   "source": [
    "VELLUM_API_KEY = input()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91df00bd",
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor = VellumPredictor(vellum_api_key=VELLUM_API_KEY)\n",
    "prompt_registry = VellumPromptRegistry(vellum_api_key=VELLUM_API_KEY)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6103746",
   "metadata": {},
   "source": [
    "## Auto-Register Prompts & Make Predictions Through Vellum\n",
    "\n",
    "We can use the `VellumPredictor` class to auto-register a prompt with Vellum as part of making predictions.\n",
    "\n",
    "Let's choose a prompt to experiment with."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "faabc643",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.prompts.default_prompts import DEFAULT_TEXT_QA_PROMPT\n",
    "\n",
    "prompt = DEFAULT_TEXT_QA_PROMPT"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8b0de89b",
   "metadata": {},
   "outputs": [],
   "source": [
    "completion, formatted_prompt = predictor.predict(\n",
    "    prompt, context_str=\"The earth is flat\", query_str=\"Is the earth round or flat?\"\n",
    ")\n",
    "\n",
    "print(completion)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "faa56b44",
   "metadata": {},
   "source": [
    "Under the hood, `VellumPredictor` noticed that no prompt of this prompt type (in this case, `PromptType.QUESTION_ANSWER`) had been registered yet, and so it auto-registered it.\n",
    "\n",
    "By registering a prompt with Vellum, Vellum will create:\n",
    "1. A \"Sandbox\" – an environment where you can iterate on the prompt, it's model, provider, params, etc.; and\n",
    "2. A \"Deployment\" – a thin API proxy between you and LLM providers and offering prompt versioning, request monitoring, and more\n",
    "\n",
    "We can use `VellumPromptRegistry` to retrieve information about the registered prompt and get links to open its corresponding Sandbox and Deployment in Vellum's UI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "859409d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "registered_prompt = prompt_registry.from_prompt(prompt)\n",
    "\n",
    "print(registered_prompt.sandbox_url)\n",
    "print(registered_prompt.deployment_url)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "d97d8d52",
   "metadata": {},
   "source": [
    "## Exploring Vellum Sandboxes\n",
    "Now that we have a Sandbox in Vellum, we can iterate on our prompt to see if we can get even better results. Let's see how `gpt-3.5-turbo` compares to `text-davinci-003` and `gpt-4` using the same prompt template.\n",
    "\n",
    "![Vellum Sandbox](images/vellum-sandbox.png)\n",
    "\n",
    "It's easy to compare and see that:\n",
    "1. ChatGPT provides a technically correct response, but maybe offers more explanation than we'd like\n",
    "2. GPT-3 is just wrong according to our context\n",
    "3. GPT-4 provides arguably the best response\n",
    "\n",
    "You can even compare models from accross different providers - try it out! Note though that you'll need an API key for each.\n",
    "\n",
    "Tip: Find your Sandbox with `print(registered_prompt.sandbox_url)`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69248471",
   "metadata": {},
   "source": [
    "## Exploring Vellum Deployments\n",
    "Vellum Deployments are meant to serve traffic to a prompt/model once it's stabilized. Deployments offer monitoring of all requests made against them. You can update the Deployment's prompt, model, and parameters in Vellum without making any code changes. These changes get version controlled and can be reverted at any time.\n",
    "\n",
    "When you call `VellumPredictor.predict` in LlamaIndex, it's proxying the LLM call through your Vellum Deployment.\n",
    "\n",
    "![Vellum Deployment](images/vellum-deployment.png)\n",
    "\n",
    "![Vellum Deployment Completions](images/vellum-deployment-completions.png)\n",
    "\n",
    "\n",
    "Tip: Find your Deployment with `print(registered_prompt.deployment_url)`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5fdafaf6",
   "metadata": {},
   "source": [
    "## Going Further\n",
    "\n",
    "We mentioned above that the LlamaIndex <> Vellum integration with auto-register LlamaIndex prompts in Vellum – one per `PromptType`. However, you may want to:\n",
    "1. Reference an already existing Deployment that you manually created in Vellum; and/or\n",
    "2. Auto-register more than one Sandbox/Deployment per `PromptType`. Perhaps you want to be able to differentiate monitoring data across two separate product use-cases that use the same `PromptType`, or maybe you want to use different model/prompt template for the same `PromptType`.\n",
    "\n",
    "### Referencing Existing Deployments\n",
    "Vellum looks for the optionally provided `Prompt.metadata[\"vellum_deployment_id\"]` and/or `Prompt.metadata[\"vellum_deployment_name\"]`. Both are uniquely identifying.\n",
    "\n",
    "If either are provided upon instantiating the Prompt, or by updating its metadata after the fact, Vellum will detect the existing Deployment and use it rather than creating a new one.\n",
    "\n",
    "### Auto-Registering Multiple Prompts Per Type\n",
    "Simlilar to above, you can use `Prompt.metadata[\"vellum_deployment_name\"]` to specify the name of the Deployment you'd like to use. If none exists by that name, the prompt will be auto-registered and a new Deployment will be created.\n",
    "\n",
    "The value of `vellum_deployment_name` should be a kebab-case string (e.g. `\"my-deployment\"` not `\"MY_DEPLOYMENT\"`).\n",
    "\n",
    "You may optionally also provide `vellum_deployment_label` to give it a human-friendly name. If none is provided, a default will be generated.\n",
    "\n",
    "### What Else?\n",
    "So far we've seen how we can use the LlamaIndex <> Vellum integration to make it easy to experiment with prompts and monitor their inputs/outputs. However, Vellum offers a full suite of tools to help bring AI-powered apps to production, including tools for quantitatively testing prompts, performing semantic search, and more. You can learn about Vellum's other capabailities here: https://www.vellum.ai/products"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "venv",
   "language": "python",
   "name": "venv"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
